-
Notifications
You must be signed in to change notification settings - Fork 980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRILL-6049: Misc. hygiene and code cleanup changes #1085
Conversation
Error HandlingThe column writer work introduces a new scan operator mechanism. Part of that mechanism is to standardize error handling. Part of that work is, in turn, to add a a couple new error types to UserError to better categorize errors. Star ColumnOrganized our many references to the "star column" to make it more readily available to the new scan framework. Date/Time UtilitiesThe various date/time vectors have very handy methods to convert the vector representation into Joda objects. This PR pulls that code out into a utility class that can be used in other places. This change uncovered a flaw in JDBC: that the existing date/time utilities were not included in the JDBC jar, causing failures. This PR fixes that. Deprecate OptionSetIn a previous set of PRs we've been assembling a sub-operator test framework. This PR continues that work by deprecating the prior OptionSet class. (OptionSet was meant to work around limitations in the original OptionManager, but those issues have since been resolved by all the great work that Jyothsna and Tim have done. Since OptionSet now no longer serves a useful purpose, this PR removes it.) Deprecate StatsWriterSimilarly, the StatsWriter class added a stats mechanism that can be used in sub-operator tests. Further experience has shown we can make the full OperatorStats work in this context, so StatsWriter is deprecated in this PR. Fix for Close of Sort OperatorA bit of code was reshuffled to close the operator context in the correct order in the sort operator in both production and unit tests. Swap ContainersAdded code to swap buffers between two containers. Builds on code added earlier to exchange buffers between two vectors. (Used in scan operator, to be added later.) CSV Column Header Error HandlingChanged CSV to use UserException to report errors instead of an ad-hoc HeaderError exception. Test ToolsAdd a number of services to the ClusterFixture family of test classes. The key new feature is the ability to read the results of a multi-batch query as a sequence of RowSet objects so we can use the row set tools to verify results. Union VectorThe union vector is a rather odd duck: it works, but is not widely supported in Drill. It stores its "type" members in two ways: in an internal map and in member variables. The member variable format makes it hard to treat the types generically. This PR keeps the serialized format (storing vectors in an internal map) but changes the member variable format to use an array of type vectors rather than individual variables. The result is that generic access (for the new Union column writer) is much easier. Materialized Field in VectorsVectors provide a MaterializedField to hold their metadata. The original idea seems to be that a MaterializedField is immutable. This worked great for simple scalar types. But, code evolved to change the MaterializedField for maps (to add map members), for Unions (to add union members) and so on. For container types (Map, Repeated Map, Union, List, Repeated List), the "parent" MaterializedField holds onto a copy of the child field. But, some parents, when they receive new members, throw away their old MaterializedField and create a new one. The result is that, if we have the structure map(map(a, b)), when we add b to the inner map, the parent ends up pointing to an older version of the child schema, not the updated one. This PR fixes at least some of those issues in the Union vector. JUnit UpgradeUpgrades JUnit from 4.11 to 4.12 to make use of a handy annotation that disables timeouts when debugging. Generic CleanupAdded comments, fixed indentation and other minor changes. |
*/ | ||
|
||
@Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deprecated means that we might remove it in future. But since you mention that it is still ok to use in certain cases, I am not sure that such annotation is appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Merge error. It is deprecated in master; I have undeprecated it in my branch...
return copy1.equals(copy2); | ||
} | ||
|
||
public static String typeKey(MinorType type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need this method? It looks like it is never used? If it's intended for future use please add java doc then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -54,8 +56,7 @@ | |||
static MemWatcher memWatcher; | |||
static String className; | |||
|
|||
@Rule public final TestRule TIMEOUT = TestTools.getTimeoutRule(100000); | |||
|
|||
@Rule public final TestRule TIMEOUT = new DisableOnDebug(TestTools.getTimeoutRule(100_000)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great enhancement!
@Override | ||
@VisibleForTesting | ||
public long getLongStat(MetricDef metric) { | ||
return longMetrics.get(metric.metricId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If metric is absent, will it return 0 or fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably fail. But, since this used in testing to ensure that a long metric was, in fact, written, failure is actually a useful result.
@@ -38,20 +38,26 @@ | |||
import org.apache.drill.exec.server.options.SystemOptionManager; | |||
import org.apache.drill.exec.store.sys.store.provider.LocalPersistentStoreProvider; | |||
import org.apache.drill.exec.util.GuavaPatcher; | |||
import org.apache.drill.test.BaseDirTestWatcher; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting from this class, github does not highlight Java code. I remember @vdiravka had similar problem, it turned out that he copied code from text editor with some hidden symbols. Please check and fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked the source. No hidden characters. For me, formatting does appear below this line.
I suspect your browser hit some limit. Large PRs make Safari struggle to display the code. Suggestion: refresh the page. search for this file, and keep going. Perhaps, if you don't expand prior files, your browser will handle the formatting. (GitHub should provide some solution for large PRs...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arina-ielchiieva, thank you for your kind review. Comments were addressed. Looks like GitHub UI prevented you from reviewing some of the files. See the suggestions inline and please try again.
*/ | ||
|
||
@Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Merge error. It is deprecated in master; I have undeprecated it in my branch...
return copy1.equals(copy2); | ||
} | ||
|
||
public static String typeKey(MinorType type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@Override | ||
@VisibleForTesting | ||
public long getLongStat(MetricDef metric) { | ||
return longMetrics.get(metric.metricId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably fail. But, since this used in testing to ensure that a long metric was, in fact, written, failure is actually a useful result.
@@ -38,20 +38,26 @@ | |||
import org.apache.drill.exec.server.options.SystemOptionManager; | |||
import org.apache.drill.exec.store.sys.store.provider.LocalPersistentStoreProvider; | |||
import org.apache.drill.exec.util.GuavaPatcher; | |||
import org.apache.drill.test.BaseDirTestWatcher; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked the source. No hidden characters. For me, formatting does appear below this line.
I suspect your browser hit some limit. Large PRs make Safari struggle to display the code. Suggestion: refresh the page. search for this file, and keep going. Perhaps, if you don't expand prior files, your browser will handle the formatting. (GitHub should provide some solution for large PRs...)
@paul-rogers, hope this is browser issue but just in case please take a look at UnionVector class to confirm that everything is fine. Also please resolve merge conflicts. Other than that, +1, LGTM. |
@paul-rogers since the DRILL-3993 (Calcite related changes) went into master last week, this PR would need to be rebased. I can merge it in soon after that. |
fb0f88a
to
51973b4
Compare
@amansinha100, rebased on master. Ran unit tests. Please run pre-commit tests and let me know if anything is amiss. Once I hear the tests pass, I'll mark this as ready-to-commit. Thanks. |
Sounds good. I will run the pre-commit tests. Since this PR was previously in a ready-to-commit stage and only needed a rebase, it is fine to mark it ready-to-commit. |
Collection of "hygiene" changes arising from the vector column writer project. This represents the set of changes outside of the core mechanism so that reviews can be done in multiple PRs rather than a single monster PR that mixes generic and core work.